
    Extraction and representation of semantic information in digital media


    Perceptual-based textures for scene labeling: a bottom-up and a top-down approach

    Due to the semantic gap, the automatic interpretation of digital images is a very challenging task. Both segmentation and classification are intricate because of the high variation in the data. Therefore, the choice of appropriate features is of the utmost importance. This paper presents biologically inspired texture features for material classification and for interpreting outdoor scenery images. Experiments show that the presented texture features obtain the best classification results for material recognition compared to other well-known texture features, with an average classification rate of 93.0%. For scene analysis, both a bottom-up and a top-down strategy are employed to bridge the semantic gap. First, images are segmented into regions based on perceptual texture; next, a semantic label is computed for these regions. Since this initial interpretation is still error-prone, domain knowledge is incorporated to achieve a more accurate description of the depicted scene. By applying both strategies, 91.9% of the pixels in outdoor scenery images obtained a correct label.
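The abstract does not detail the knowledge-based correction step. As a minimal, hypothetical sketch of the idea, domain rules can revise low-confidence bottom-up labels; all labels, rules, and confidence values below are invented for illustration and are not the paper's actual knowledge base.

```python
# Each region produced by the bottom-up stage is summarized here as
# (label, classifier confidence, vertical centroid position in [0, 1]).
# All values are invented for illustration.
regions = [
    ("sky",   0.9, 0.1),
    ("water", 0.4, 0.2),   # low-confidence 'water' near the top of the image
    ("grass", 0.8, 0.8),
]

def apply_domain_knowledge(regions):
    """Example top-down rule: a low-confidence 'water' region in the upper
    half of an outdoor image is more plausibly 'sky'."""
    revised = []
    for label, conf, ypos in regions:
        if label == "water" and conf < 0.5 and ypos < 0.5:
            label = "sky"
        revised.append((label, conf, ypos))
    return revised

print(apply_domain_knowledge(regions))
# [('sky', 0.9, 0.1), ('sky', 0.4, 0.2), ('grass', 0.8, 0.8)]
```

The point of the sketch is only the control flow: bottom-up evidence proposes labels, and top-down scene knowledge vetoes implausible combinations.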

    Semantic web technologies for video surveillance metadata

    Video surveillance systems are growing in size and complexity. Such systems typically consist of integrated modules from different vendors to cope with the increasing demands on network and storage capacity, intelligent video analytics, picture quality, and enhanced visual interfaces. Within a surveillance system, relevant information (such as technical details of the video sequences, or analysis results of the monitored environment) is described using metadata standards. However, different modules typically use different standards, resulting in metadata interoperability problems. In this paper, we introduce the application of Semantic Web technologies to overcome such problems. We present a semantic, layered metadata model and integrate it within a video surveillance system. Besides addressing the metadata interoperability problem, we show the advantages of using Semantic Web technologies and their inherent rule support. A practical use case scenario illustrates the benefits of our novel approach.
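The paper's metadata model is not reproduced here. As a minimal sketch of the underlying idea, surveillance metadata can be expressed as subject-predicate-object triples (the Semantic Web's RDF data model) over which simple rules can infer new facts; all identifiers below (camera1, zoneA, event42) are invented.

```python
# Metadata from different modules integrated as plain triples.
# Identifiers are hypothetical; a real system would use RDF and a rule engine.
triples = {
    ("camera1", "monitors", "zoneA"),
    ("zoneA", "type", "RestrictedZone"),
    ("event42", "detectedBy", "camera1"),
    ("event42", "type", "PersonDetected"),
}

def query(s=None, p=None, o=None):
    """Return all triples matching the pattern (None acts as a wildcard)."""
    return [(ts, tp, to) for ts, tp, to in triples
            if (s is None or ts == s)
            and (p is None or tp == p)
            and (o is None or to == o)]

def restricted_zone_alerts():
    """Example rule: a person detected by a camera that monitors a
    restricted zone raises an alert -- inference over integrated metadata."""
    alerts = []
    for event, _, cam in query(p="detectedBy"):
        if not query(s=event, p="type", o="PersonDetected"):
            continue
        for _, _, zone in query(s=cam, p="monitors"):
            if query(s=zone, p="type", o="RestrictedZone"):
                alerts.append((event, zone))
    return alerts

print(restricted_zone_alerts())  # [('event42', 'zoneA')]
```

The benefit illustrated is that the rule combines facts contributed by different modules (camera topology, zone semantics, analytics results) without those modules sharing a vendor-specific format.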

    Noise- and compression-robust biological features for texture classification

    Texture classification is an important aspect of many digital image processing applications, such as surface inspection, content-based image retrieval, and biomedical image analysis. However, noise and compression artifacts in images cause problems for most texture analysis methods. This paper proposes the use of features based on the human visual system for texture classification, using a semi-supervised, hierarchical approach. The texture features consist of responses of cells found in the visual cortex of higher primates. Classification experiments on different texture libraries indicate that the proposed features obtain a very high classification rate of nearly 97%. In contrast to other well-established texture analysis methods, the experiments indicate that the proposed features are more robust to various levels of speckle and Gaussian noise. Furthermore, we show that the classification rate of the textures using the presented biologically inspired features is hardly affected by image compression techniques.
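The abstract does not specify the cell model. Simple cells in primate visual cortex are commonly modeled with Gabor filters, so the following sketch, under that assumption and with invented filter parameters, illustrates how orientation-tuned response energies can serve as texture features; it is not the paper's actual feature pipeline.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """2-D Gabor kernel: a common model of V1 simple-cell receptive fields."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)   # rotate coordinates by theta
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + yr**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / wavelength)
    kernel = envelope * carrier
    return kernel - kernel.mean()                # zero mean: ignore brightness

def texture_features(image, thetas=(0, np.pi/4, np.pi/2, 3*np.pi/4)):
    """Mean response energy per orientation -- a crude texture descriptor."""
    feats = []
    for theta in thetas:
        k = gabor_kernel(9, wavelength=4.0, theta=theta, sigma=2.0)
        h, w = image.shape
        kh, kw = k.shape
        # Valid-mode correlation, written as explicit loops for clarity.
        resp = np.zeros((h - kh + 1, w - kw + 1))
        for i in range(resp.shape[0]):
            for j in range(resp.shape[1]):
                resp[i, j] = np.sum(image[i:i+kh, j:j+kw] * k)
        feats.append(np.mean(resp**2))
    return np.array(feats)

# Vertical stripes vary horizontally, so the theta=0 filter responds most.
stripes = np.tile([0.0, 0.0, 1.0, 1.0], (20, 5))  # 20x20 vertical stripes
f = texture_features(stripes)
```

Robustness to noise, as reported in the abstract, plausibly comes from the Gaussian envelope averaging over many pixels; this sketch only shows the orientation selectivity.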

    Intelligent pre-processing for fast-moving object detection

    Detection and segmentation of objects of interest in image sequences is the first major processing step in visual surveillance applications. The outcome is used for further processing, such as object tracking, interpretation, and classification of objects and their trajectories. To speed up moving object detection, many applications use techniques such as frame rate reduction. However, temporal consistency is an important property in the analysis of surveillance video, especially for tracking objects. Another technique is downscaling the images before analysis, after which the images are up-sampled to regain the original size. This method, however, amplifies the effect of false detections. We propose a different pre-processing step in which we use a checkerboard-like mask to decide which pixels to process. The mask is inverted for each frame, so that no pixel position is ever permanently excluded from analysis. In a post-processing step, we use spatial interpolation to predict the detection results for the pixels that were not analyzed. To evaluate our system, we combined it with a background subtraction technique based on a mixture of Gaussian models. Results show that the models are not corrupted by using our mask, and we can reduce the processing time by over 45% while achieving detection results similar to those of the conventional technique.
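The masking and interpolation steps above can be sketched as follows. This is a minimal illustration, assuming a majority vote over the analyzed 4-neighbours for the spatial interpolation step; the paper's actual interpolation scheme may differ.

```python
import numpy as np

def checkerboard_mask(shape, frame_index):
    """Checkerboard mask of pixels to analyze; inverted on alternate frames
    so that every pixel position is analyzed at least every second frame."""
    y, x = np.indices(shape)
    return (y + x + frame_index) % 2 == 0

def interpolate_skipped(detection, mask):
    """Predict labels at skipped pixels by majority vote over their analyzed
    4-neighbours -- the spatial interpolation post-processing step."""
    padded = np.pad(detection & mask, 1).astype(int)   # detections at analyzed pixels
    pm = np.pad(mask, 1).astype(int)                   # which neighbours were analyzed
    votes = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
             padded[1:-1, :-2] + padded[1:-1, 2:])
    counts = (pm[:-2, 1:-1] + pm[2:, 1:-1] +
              pm[1:-1, :-2] + pm[1:-1, 2:])
    filled = detection.copy()
    skipped = ~mask
    filled[skipped] = votes[skipped] * 2 >= counts[skipped]  # majority vote
    return filled

# Toy example: a 3x3 foreground object, detected only at the masked pixels,
# is fully recovered by the interpolation step.
mask = checkerboard_mask((6, 6), frame_index=0)
truth = np.zeros((6, 6), dtype=bool)
truth[2:5, 2:5] = True          # ground-truth moving object
partial = truth & mask          # the detector only ran on masked pixels
recovered = interpolate_skipped(partial, mask)
```

Because the background model is only updated at analyzed positions, each position is still visited every other frame, which is consistent with the reported result that the Gaussian mixture models do not get corrupted.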

    The MAMI Query-By-Voice Experiment: Collecting and annotating vocal queries for music information retrieval

    The MIR research community requires coordinated strategies for dealing with databases for system development and experimentation. Manually annotated files can accelerate the development of accurate analysis tools for music information retrieval. This paper presents background information on an annotated database of vocal queries that is freely available on the Internet. First, we outline the design and setup of the experiment through which the vocal queries were generated. Then, attention is drawn to the manual annotation of the vocal queries.